Text-dependent pathological voice detection
Authors
Abstract
While global characteristics of the speaker’s source and spectral features have been successfully employed in pathological voice detection, the underlying text has largely been ignored. In this work, we focus on experiments that exploit the text stimulus that is read by the subject. Features derived from text include the mean cepstral distortion of the subject from an average intelligible speaker, and prosodic features include the speaking rate, statistics of phoneme durations, etc. The phonetic labeling information is also exploited to ignore all the unvoiced regions of the speech samples to improve the discriminability between intelligible and pathological voices. We also designed features that capture the speaker’s overall closeness to intelligible instances of the same text stimulus from other speakers. Our experiments show that the proposed text-derived features improve the detection of pathological voices by 20%.
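The text-derived features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a forced alignment is available as a list of (phone label, start, end) tuples, that the subject's cepstral frames are already time-aligned with those of a reference intelligible speaker, and that distortion is measured with the common dB-scaled Euclidean formula. The label sets `SILENCE` and `UNVOICED`, the 10 ms frame shift, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical label sets; a real system would take these from its phone inventory.
SILENCE = {"sil", "sp"}
UNVOICED = {"p", "t", "k", "f", "s", "sh", "ch", "th", "hh"} | SILENCE


def prosodic_features(segments):
    """segments: list of (phone_label, start_sec, end_sec) from a forced alignment."""
    durations = np.array([end - start for label, start, end in segments
                          if label not in SILENCE])
    total_time = segments[-1][2] - segments[0][1]
    return {
        "speaking_rate": len(durations) / total_time,  # phones per second
        "mean_duration": float(durations.mean()),
        "std_duration": float(durations.std()),
    }


def voiced_frame_mask(segments, n_frames, frame_shift=0.01):
    """Boolean mask that is True for frames lying inside voiced phones."""
    mask = np.zeros(n_frames, dtype=bool)
    for label, start, end in segments:
        if label not in UNVOICED:
            lo = int(round(start / frame_shift))
            hi = int(round(end / frame_shift))
            mask[lo:hi] = True
    return mask


def mean_cepstral_distortion(ref, test, mask=None):
    """Mean per-frame distortion (dB) between time-aligned (frames, coeffs) arrays."""
    if mask is not None:
        ref, test = ref[mask], test[mask]
    diff = ref - test
    # Assumed formula: (10 / ln 10) * sqrt(2 * sum of squared coefficient differences)
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(per_frame.mean())


# Toy usage with made-up alignment and random cepstra (illustrative values only).
segments = [("sil", 0.0, 0.1), ("s", 0.1, 0.2), ("iy", 0.2, 0.4), ("sil", 0.4, 0.5)]
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 13))
subject = reference + 0.1 * rng.normal(size=(50, 13))
mask = voiced_frame_mask(segments, n_frames=50)
print(prosodic_features(segments))
print(mean_cepstral_distortion(reference, subject, mask))
```

Passing the voiced-frame mask restricts the distortion to voiced regions, which mirrors the abstract's use of phonetic labels to ignore unvoiced speech. The resulting scalars could then be appended to the global source and spectral features before classification.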
Similar resources
Automatic detection of voice impairments from text-dependent running speech using a discriminative approach
Most vocal and voice diseases cause changes in the acoustic voice signal. Acoustic analysis is a useful tool for diagnosing this kind of disease. Furthermore, it presents several advantages: it is non-invasive, it provides an objective diagnosis, and it can be used for the evaluation of surgical and pharmacological treatments and rehabilitation processes. Most of the approaches found in t...
Automatic age detection in normal and pathological voice
Systems that automatically detect voice pathologies are usually trained with recordings from populations of all ages. However, such an approach might be inadequate because of the acoustic variations in the voice caused by the natural aging process. On top of that, elderly voices present some perturbations in quality similar to those related to voice disorders, which make the detection of pa...
Voice pathology detection based on short-term jitter estimations in running speech.
In this paper, we investigate the use of jitter estimation over short time intervals (short-term jitter) for voice pathology detection in the case of running or continuous speech. Short-term jitter estimations are provided by the spectral jitter estimator (SJE), which is based on a mathematical description of the jitter phenomenon. The SJE has been shown to be robust against errors in pitch per...
Anti-spoofing, Voice Conversion
Voice conversion is a process which converts or transforms one speaker’s voice towards that of another. The literature shows that voice conversion can be used to spoof or fool an automatic speaker verification system. State-of-the-art voice conversion algorithms can produce high-quality speech signals in real time and are capable of fooling both human listeners and automatic systems, including ...
Speech detection for text-dependent speaker verification
The performance of text-dependent speaker verification systems degrades in noisy environments and when the true speaker utters words that are not part of the verification password. Energy-based voice activity detection (VAD) algorithms cannot distinguish between the true speaker’s speech and other background speech or between the speaker’s verification password and other words uttered by the spe...